Create couples datasets

Microdata.no offers a number of demographic variables that make it possible to identify, among others, spouses, cohabitants and partners, and to link information about both members of a couple for use in analyses.

The variable BEFOLKNING_REGSTAT_FAMNR contains the family number of each individual; all members of a family share the same value. It can be used to create family information or to aggregate data up to family level. Because the family number is the personal identity number of the oldest person in the family, it can also be combined with other family information to find people who live together as a couple (spouses, cohabitants, partners):

  1. Start by creating a data set consisting of information about:

    • Family number
    • Personal code
    • Couple status (optional)
    • Any additional information you need, e.g. occupation
  2. Select the oldest person in each family. Use the variable personcode for this (personcode = "1").

  3. Create a new dataset that imports the same variables, but select the youngest person in each family (personcode = "2"). Children are excluded because they have their own personal code (personcode = "3").

  4. Switch back to the dataset with the oldest persons and merge the relevant variables into the dataset with the youngest persons. Because the family number of each youngest person identifies the oldest person in the family, the family number can be used as the linking key, and the result is a dataset covering all couples.

  5. Finally, switch to the dataset with the youngest persons. It now contains your data for both members of every couple in your population.

//Connect to database
require no.ssb.fdb:30 as ds

//Create dataset consisting of oldest person in each family
create-dataset oldest
import ds/BEFOLKNING_REGSTAT_FAMNR 2021-01-01 as famnr
import ds/BEFOLKNING_REGSTAT_PERSONKODE 2021-01-01 as personcode_oldest
import ds/BEFOLKNING_PARSTATUS 2021-01-01 as couplestatus_oldest
import ds/REGSYS_ARB_YRKE_STYRK08 2020-11-16 as occupation_oldest
keep if personcode_oldest == '1'

//Create dataset consisting of the youngest person in each family, excluding children
create-dataset youngest
import ds/BEFOLKNING_REGSTAT_FAMNR 2021-01-01 as famnr
import ds/BEFOLKNING_REGSTAT_PERSONKODE 2021-01-01 as personcode_youngest
import ds/BEFOLKNING_PARSTATUS 2021-01-01 as couplestatus_youngest
import ds/REGSYS_ARB_YRKE_STYRK08 2020-11-16 as occupation_youngest
keep if personcode_youngest == '2'

//Use dataset with oldest persons and link the variables into the dataset with the youngest persons via family number (family number = oldest person id-number)
use oldest
merge personcode_oldest couplestatus_oldest occupation_oldest into youngest on famnr 

//The dataset youngest now contains data on both spouses/couple members for all couples in the population. Run control tabulations
use youngest
tabulate personcode_youngest, missing
tabulate personcode_oldest, missing
tabulate couplestatus_youngest, missing
tabulate couplestatus_oldest, missing
tabulate occupation_oldest, missing
tabulate occupation_youngest, missing

//Check if both, one or none of the spouses/couple members are working
generate job_youngest = 1 if sysmiss(occupation_youngest) == 0
generate job_oldest = 1 if sysmiss(occupation_oldest) == 0
tabulate job_oldest job_youngest, missing
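
As a possible follow-up to the script above, the sketch below adds two further checks that reuse only variables already present in the youngest dataset. The variable name working_members is hypothetical, and, as in the script above, a non-missing occupation code is taken to mean that the person is working; that is an assumption made for illustration, not a definition.

```
//Run on the couples dataset
use youngest

//Cross-tabulate the couple status of both members as a consistency check
tabulate couplestatus_oldest couplestatus_youngest, missing

//Count the number of working members in each couple (0, 1 or 2)
//Assumption: a non-missing occupation code means the person is working
generate working_members = 0
replace working_members = working_members + 1 if sysmiss(occupation_oldest) == 0
replace working_members = working_members + 1 if sysmiss(occupation_youngest) == 0
tabulate working_members, missing
```

Unlike job_oldest and job_youngest, which are missing for persons without an occupation code, working_members has a value for every couple, so it may be easier to use as a grouping variable in later tabulations.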